Dataset statistics
| Number of variables | 20 |
|---|---|
| Number of observations | 19812 |
| Missing cells | 17184 |
| Missing cells (%) | 4.3% |
| Duplicate rows | 557 |
| Duplicate rows (%) | 2.8% |
| Total size in memory | 2.2 MiB |
| Average record size in memory | 118.0 B |
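The percentages in this table can be rederived from the raw counts. The following is a minimal sketch, assuming the missing-cell percentage is taken over all rows × columns cells and the duplicate percentage over all rows; the variable names are illustrative and not part of the report.

```python
# Counts copied from the "Dataset statistics" table.
n_variables = 20
n_observations = 19812
missing_cells = 17184
duplicate_rows = 557

# Per-column missing counts reported for type_of_sale and building.
missing_by_column = {"type_of_sale": 12992, "building": 4192}

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
# The two columns with missing values account for every missing cell.
print(sum(missing_by_column.values()) == missing_cells)
```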
Variable types
| NUM | 9 |
|---|---|
| BOOL | 7 |
| CAT | 4 |
Warnings
nr_facades has constant value "0" | Constant |
Dataset has 557 (2.8%) duplicate rows | Duplicates |
basement is highly correlated with land | High correlation |
land is highly correlated with basement | High correlation |
type_subproperty is highly correlated with type_property | High correlation |
type_property is highly correlated with type_subproperty | High correlation |
type_of_sale has 12992 (65.6%) missing values | Missing |
building has 4192 (21.2%) missing values | Missing |
netHabitableSurface is highly skewed (γ1 = 51.83050046) | Skewed |
nr_bedrooms is highly skewed (γ1 = 25.94489667) | Skewed |
garden_m2 is highly skewed (γ1 = 29.79909112) | Skewed |
land is highly skewed (γ1 = 20.56953032) | Skewed |
basement is highly skewed (γ1 = 21.31584435) | Skewed |
netHabitableSurface has 2728 (13.8%) zeros | Zeros |
nr_bedrooms has 2492 (12.6%) zeros | Zeros |
garden_m2 has 16425 (82.9%) zeros | Zeros |
terrace_m2 has 12429 (62.7%) zeros | Zeros |
land has 10685 (53.9%) zeros | Zeros |
basement has 1439 (7.3%) zeros | Zeros |
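The skewness warnings refer to γ1, the standardized third moment. As a rough illustration of why a few extreme values produce a large γ1, the sketch below uses the simple (biased) moment estimator on hypothetical area values; the estimator used by the report may apply a bias correction, and the `log1p` transform is shown only as one common way to reduce right skew, not as something the report does.

```python
import math

def skewness(values):
    """Biased moment estimator of skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical surface areas with a long right tail (not taken from the dataset).
areas = [50, 80, 100, 120, 150, 200, 300, 500, 1000, 5000]

raw = skewness(areas)
logged = skewness([math.log1p(v) for v in areas])

print(f"skewness of raw values:   {raw:.2f}")
print(f"skewness after log1p:     {logged:.2f}")
```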
Reproduction
| Analysis started | 2020-11-19 10:52:47.189232 |
|---|---|
| Analysis finished | 2020-11-19 10:53:09.380902 |
| Duration | 22.19 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
Unnamed: 0
Real number (ℝ≥0)
| Distinct | 19253 |
|---|---|
| Distinct (%) | 97.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8727770.808 |
|---|---|
| Minimum | 2884976 |
| Maximum | 8957547 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 154.8 KiB |
Quantile statistics
| Minimum | 2884976 |
|---|---|
| 5-th percentile | 8082832.1 |
| Q1 | 8693416.25 |
| median | 8839908.5 |
| Q3 | 8911999.25 |
| 95-th percentile | 8948774.05 |
| Maximum | 8957547 |
| Range | 6072571 |
| Interquartile range (IQR) | 218583 |
Descriptive statistics
| Standard deviation | 335383.9802 |
|---|---|
| Coefficient of variation (CV) | 0.03842722129 |
| Kurtosis | 33.79564081 |
| Mean | 8727770.808 |
| Median Absolute Deviation (MAD) | 86831.5 |
| Skewness | -4.298438317 |
| Sum | 1.729145952e+11 |
| Variance | 1.124824142e+11 |
| Monotonicity | Not monotonic |
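Several of the derived statistics above follow directly from the others. A small sketch, using the rounded values printed for `Unnamed: 0`, checks the range, the interquartile range, the coefficient of variation (standard deviation divided by mean) and the variance (standard deviation squared); small discrepancies are expected because the inputs are rounded.

```python
# Values copied from the "Unnamed: 0" tables.
minimum, maximum = 2884976, 8957547
q1, q3 = 8693416.25, 8911999.25
mean = 8727770.808
std = 335383.9802

value_range = maximum - minimum   # reported: 6072571
iqr = q3 - q1                     # reported: 218583
cv = std / mean                   # reported: 0.03842722129
variance = std ** 2               # reported: 1.124824142e+11

print(value_range, iqr)
print(f"{cv:.8f}")
print(f"{variance:.6e}")
```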
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 8835029 | 2 | < 0.1% | |
| 8893036 | 2 | < 0.1% | |
| 8944001 | 2 | < 0.1% | |
| 8827826 | 2 | < 0.1% | |
| 8922648 | 2 | < 0.1% | |
| 8254795 | 2 | < 0.1% | |
| 8948959 | 2 | < 0.1% | |
| 8459484 | 2 | < 0.1% | |
| 8881245 | 2 | < 0.1% | |
| 8884365 | 2 | < 0.1% | |
| Other values (19243) | 19792 | 99.9% |
| Value | Count | Frequency (%) | |
| 2884976 | 1 | < 0.1% | |
| 3816754 | 1 | < 0.1% | |
| 3855973 | 1 | < 0.1% | |
| 3880650 | 1 | < 0.1% | |
| 3993780 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 8957547 | 1 | < 0.1% | |
| 8957537 | 1 | < 0.1% | |
| 8957535 | 1 | < 0.1% | |
| 8957516 | 1 | < 0.1% | |
| 8957515 | 1 | < 0.1% |
type_property
Categorical
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 154.8 KiB |
| Value | Count | Frequency (%) | |
| HOUSE | 9309 | 47.0% | |
| APARTMENT | 8582 | 43.3% | |
| COMMERCIAL | 1113 | 5.6% | |
| INDUSTRY | 285 | 1.4% | |
| OFFICE | 275 | 1.4% | |
| OTHER | 202 | 1.0% | |
| GARAGE | 43 | 0.2% | |
| LAND | 3 | < 0.1% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 10 |
|---|---|
| Median length | 8 |
| Mean length | 7.072632748 |
| Min length | 4 |
type_subproperty
Categorical
| Distinct | 49 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 154.8 KiB |
| Value | Count | Frequency (%) | |
| APARTMENT | 6472 | 32.7% | |
| HOUSE | 4602 | 23.2% | |
| APARTMENT_BLOCK | 1583 | 8.0% | |
| MIXED_USE_BUILDING | 1388 | 7.0% | |
| VILLA | 990 | 5.0% | |
| MIXED_USE_BUILDING_COMMERCIAL | 723 | 3.6% | |
| DUPLEX | 580 | 2.9% | |
| PENTHOUSE | 526 | 2.7% | |
| GROUND_FLOOR | 419 | 2.1% | |
| FLAT_STUDIO | 302 | 1.5% | |
| Other values (39) | 2227 | 11.2% |
Frequencies of value counts
Unique
| Unique | 2 |
|---|---|
| Unique (%) | < 0.1% |
Histogram of lengths of the category
Length
| Max length | 29 |
|---|---|
| Median length | 9 |
| Mean length | 10.27422774 |
| Min length | 3 |
price
Real number (ℝ≥0)
| Distinct | 2062 |
|---|---|
| Distinct (%) | 10.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 504305.0275 |
|---|---|
| Minimum | 0 |
| Maximum | 15000000 |
| Zeros | 30 |
| Zeros (%) | 0.2% |
| Memory size | 154.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 121550 |
| Q1 | 220000 |
| median | 325000 |
| Q3 | 535000 |
| 95-th percentile | 1495000 |
| Maximum | 15000000 |
| Range | 15000000 |
| Interquartile range (IQR) | 315000 |
Descriptive statistics
| Standard deviation | 624065.9008 |
|---|---|
| Coefficient of variation (CV) | 1.237477056 |
| Kurtosis | 58.74828977 |
| Mean | 504305.0275 |
| Median Absolute Deviation (MAD) | 130000 |
| Skewness | 5.736459063 |
| Sum | 9991291204 |
| Variance | 3.894582485e+11 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 295000 | 234 | 1.2% | |
| 299000 | 224 | 1.1% | |
| 275000 | 222 | 1.1% | |
| 199000 | 220 | 1.1% | |
| 395000 | 212 | 1.1% | |
| 225000 | 202 | 1.0% | |
| 249000 | 200 | 1.0% | |
| 250000 | 181 | 0.9% | |
| 495000 | 172 | 0.9% | |
| 325000 | 171 | 0.9% | |
| Other values (2052) | 17774 | 89.7% |
| Value | Count | Frequency (%) | |
| 0 | 30 | 0.2% | |
| 700 | 1 | < 0.1% | |
| 2500 | 4 | < 0.1% | |
| 3000 | 1 | < 0.1% | |
| 6000 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 15000000 | 1 | < 0.1% | |
| 12700000 | 1 | < 0.1% | |
| 11500000 | 1 | < 0.1% | |
| 10000000 | 1 | < 0.1% | |
| 9500000 | 1 | < 0.1% |
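The quartiles above can be used to put the extreme prices in context. The sketch below applies Tukey's 1.5 × IQR rule, which is a common convention chosen here for illustration and not something the report itself computes; `tukey_fences` is a hypothetical helper.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) fences at k * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Q1 and Q3 copied from the price quantile table.
lower, upper = tukey_fences(220_000, 535_000)

# The five largest prices listed in the extreme-values table.
largest_prices = [15_000_000, 12_700_000, 11_500_000, 10_000_000, 9_500_000]
flagged = [p for p in largest_prices if p > upper]

print(lower, upper)
print(len(flagged), "of", len(largest_prices), "largest prices exceed the upper fence")
```

Because the lower fence is negative, this rule does not flag the 30 zero prices; those would need a separate check.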
locality
Real number (ℝ≥0)
| Distinct | 941 |
|---|---|
| Distinct (%) | 4.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5142.060267 |
|---|---|
| Minimum | 1000 |
| Maximum | 9992 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 154.8 KiB |
Quantile statistics
| Minimum | 1000 |
|---|---|
| 5-th percentile | 1050 |
| Q1 | 1830 |
| median | 4870 |
| Q3 | 8380 |
| 95-th percentile | 9470 |
| Maximum | 9992 |
| Range | 8992 |
| Interquartile range (IQR) | 6550 |
Descriptive statistics
| Standard deviation | 3150.37723 |
|---|---|
| Coefficient of variation (CV) | 0.6126682821 |
| Kurtosis | -1.600802044 |
| Mean | 5142.060267 |
| Median Absolute Deviation (MAD) | 3430 |
| Skewness | 0.02093949828 |
| Sum | 101874498 |
| Variance | 9924876.689 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 8300 | 771 | 3.9% | |
| 1180 | 559 | 2.8% | |
| 9000 | 482 | 2.4% | |
| 1000 | 464 | 2.3% | |
| 1050 | 412 | 2.1% | |
| 8400 | 281 | 1.4% | |
| 4000 | 257 | 1.3% | |
| 1070 | 229 | 1.2% | |
| 8000 | 215 | 1.1% | |
| 2000 | 212 | 1.1% | |
| Other values (931) | 15930 | 80.4% |
| Value | Count | Frequency (%) | |
| 1000 | 464 | 2.3% | |
| 1020 | 75 | 0.4% | |
| 1030 | 190 | 1.0% | |
| 1040 | 114 | 0.6% | |
| 1050 | 412 | 2.1% |
| Value | Count | Frequency (%) | |
| 9992 | 3 | < 0.1% | |
| 9991 | 11 | 0.1% | |
| 9990 | 29 | 0.1% | |
| 9988 | 3 | < 0.1% | |
| 9981 | 1 | < 0.1% |
netHabitableSurface
Real number (ℝ≥0)
| Distinct | 862 |
|---|---|
| Distinct (%) | 4.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 197.5509287 |
|---|---|
| Minimum | 0 |
| Maximum | 50000 |
| Zeros | 2728 |
| Zeros (%) | 13.8% |
| Memory size | 154.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 80 |
| median | 130 |
| Q3 | 220 |
| 95-th percentile | 521 |
| Maximum | 50000 |
| Range | 50000 |
| Interquartile range (IQR) | 140 |
Descriptive statistics
| Standard deviation | 521.8963402 |
|---|---|
| Coefficient of variation (CV) | 2.641831874 |
| Kurtosis | 4396.795598 |
| Mean | 197.5509287 |
| Median Absolute Deviation (MAD) | 65 |
| Skewness | 51.83050046 |
| Sum | 3913879 |
| Variance | 272375.7899 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0 | 2728 | 13.8% | |
| 120 | 307 | 1.5% | |
| 100 | 293 | 1.5% | |
| 150 | 283 | 1.4% | |
| 90 | 274 | 1.4% | |
| 200 | 267 | 1.3% | |
| 160 | 243 | 1.2% | |
| 80 | 230 | 1.2% | |
| 110 | 227 | 1.1% | |
| 180 | 222 | 1.1% | |
| Other values (852) | 14738 | 74.4% |
| Value | Count | Frequency (%) | |
| 0 | 2728 | 13.8% | |
| 5 | 2 | < 0.1% | |
| 15 | 4 | < 0.1% | |
| 16 | 4 | < 0.1% | |
| 17 | 5 | < 0.1% |
| Value | Count | Frequency (%) | |
| 50000 | 1 | < 0.1% | |
| 21000 | 1 | < 0.1% | |
| 15511 | 1 | < 0.1% | |
| 10837 | 1 | < 0.1% | |
| 10000 | 2 | < 0.1% |
nr_bedrooms
Real number (ℝ≥0)
| Distinct | 49 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.894306481 |
|---|---|
| Minimum | 0 |
| Maximum | 204 |
| Zeros | 2492 |
| Zeros (%) | 12.6% |
| Memory size | 154.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 2 |
| median | 3 |
| Q3 | 4 |
| 95-th percentile | 6 |
| Maximum | 204 |
| Range | 204 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 3.912543218 |
|---|---|
| Coefficient of variation (CV) | 1.351806813 |
| Kurtosis | 1151.16406 |
| Mean | 2.894306481 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 25.94489667 |
| Sum | 57342 |
| Variance | 15.30799444 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=49)
| Value | Count | Frequency (%) | |
| 2 | 5314 | 26.8% | |
| 3 | 5094 | 25.7% | |
| 0 | 2492 | 12.6% | |
| 4 | 2435 | 12.3% | |
| 1 | 1788 | 9.0% | |
| 5 | 1212 | 6.1% | |
| 6 | 637 | 3.2% | |
| 7 | 260 | 1.3% | |
| 8 | 169 | 0.9% | |
| 9 | 100 | 0.5% | |
| Other values (39) | 311 | 1.6% |
| Value | Count | Frequency (%) | |
| 0 | 2492 | 12.6% | |
| 1 | 1788 | 9.0% | |
| 2 | 5314 | 26.8% | |
| 3 | 5094 | 25.7% | |
| 4 | 2435 | 12.3% |
| Value | Count | Frequency (%) | |
| 204 | 3 | < 0.1% | |
| 100 | 1 | < 0.1% | |
| 99 | 1 | < 0.1% | |
| 90 | 2 | < 0.1% | |
| 80 | 1 | < 0.1% |
kitchen_installed
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 19.3 KiB |
| Value | Count | Frequency (%) | |
| True | 14379 | 72.6% | |
| False | 5433 | 27.4% |
nr_facades
Boolean
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 154.8 KiB |
| Value | Count | Frequency (%) | |
| 0 | 19812 | 100.0% |
hasGarden
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 19.3 KiB |
| Value | Count | Frequency (%) | |
| False | 16425 | 82.9% | |
| True | 3387 | 17.1% |
garden_m2
Real number (ℝ≥0)
| Distinct | 769 |
|---|---|
| Distinct (%) | 3.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 164.5301333 |
|---|---|
| Minimum | 0 |
| Maximum | 94000 |
| Zeros | 16425 |
| Zeros (%) | 82.9% |
| Memory size | 154.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 510 |
| Maximum | 94000 |
| Range | 94000 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1681.836659 |
|---|---|
| Coefficient of variation (CV) | 10.22205857 |
| Kurtosis | 1149.066964 |
| Mean | 164.5301333 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 29.79909112 |
| Sum | 3259671 |
| Variance | 2828574.547 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0 | 16425 | 82.9% | |
| 100 | 110 | 0.6% | |
| 50 | 83 | 0.4% | |
| 200 | 79 | 0.4% | |
| 300 | 69 | 0.3% | |
| 150 | 62 | 0.3% | |
| 500 | 59 | 0.3% | |
| 60 | 59 | 0.3% | |
| 40 | 59 | 0.3% | |
| 400 | 58 | 0.3% | |
| Other values (759) | 2749 | 13.9% |
| Value | Count | Frequency (%) | |
| 0 | 16425 | 82.9% | |
| 1 | 58 | 0.3% | |
| 2 | 2 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 4 | 3 | < 0.1% |
| Value | Count | Frequency (%) | |
| 94000 | 1 | < 0.1% | |
| 75000 | 1 | < 0.1% | |
| 63000 | 1 | < 0.1% | |
| 58000 | 1 | < 0.1% | |
| 55000 | 2 | < 0.1% |
hasTerrace
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 19.3 KiB |
| Value | Count | Frequency (%) | |
| True | 11602 | 58.6% | |
| False | 8210 | 41.4% |
terrace_m2
Real number (ℝ≥0)
| Distinct | 177 |
|---|---|
| Distinct (%) | 0.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10.50585504 |
|---|---|
| Minimum | 0 |
| Maximum | 1383 |
| Zeros | 12429 |
| Zeros (%) | 62.7% |
| Memory size | 154.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 12 |
| 95-th percentile | 50 |
| Maximum | 1383 |
| Range | 1383 |
| Interquartile range (IQR) | 12 |
Descriptive statistics
| Standard deviation | 27.22180594 |
|---|---|
| Coefficient of variation (CV) | 2.591108086 |
| Kurtosis | 388.4119651 |
| Mean | 10.50585504 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 11.83684498 |
| Sum | 208142 |
| Variance | 741.0267185 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0 | 12429 | 62.7% | |
| 10 | 435 | 2.2% | |
| 20 | 425 | 2.1% | |
| 15 | 355 | 1.8% | |
| 8 | 327 | 1.7% | |
| 6 | 305 | 1.5% | |
| 12 | 287 | 1.4% | |
| 30 | 285 | 1.4% | |
| 25 | 261 | 1.3% | |
| 9 | 258 | 1.3% | |
| Other values (167) | 4445 | 22.4% |
| Value | Count | Frequency (%) | |
| 0 | 12429 | 62.7% | |
| 1 | 29 | 0.1% | |
| 2 | 113 | 0.6% | |
| 3 | 169 | 0.9% | |
| 4 | 210 | 1.1% |
| Value | Count | Frequency (%) | |
| 1383 | 1 | < 0.1% | |
| 708 | 1 | < 0.1% | |
| 495 | 1 | < 0.1% | |
| 450 | 2 | < 0.1% | |
| 400 | 3 | < 0.1% |
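The zero count for terrace_m2 is larger than the number of rows with hasTerrace set to False, so some rows flag a terrace while recording an area of 0. The sketch below derives a lower bound on how many such rows exist from the reported counts alone; reading those zeros as "area unknown" is an assumption, not something the report states. For the garden pair the two counts coincide (16425 zeros and 16425 False values).

```python
# Counts copied from the hasTerrace and terrace_m2 tables.
n_rows = 19812
has_terrace_true = 11602
terrace_m2_zeros = 12429

# Rows that report a positive terrace area.
positive_area_rows = n_rows - terrace_m2_zeros

# Even if every positive-area row also has hasTerrace == True, the remaining
# True flags must belong to rows whose recorded area is 0.
min_flagged_without_area = has_terrace_true - positive_area_rows

print(positive_area_rows)
print(min_flagged_without_area)
```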
furnished_YN
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 19.3 KiB |
| Value | Count | Frequency (%) | |
| False | 19184 | 96.8% | |
| True | 628 | 3.2% |
swimpool_YN
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 19.3 KiB |
| Value | Count | Frequency (%) | |
| False | 19330 | 97.6% | |
| True | 482 | 2.4% |
type_of_sale
Categorical
| Distinct | 7 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 12992 |
| Missing (%) | 65.6% |
| Memory size | 154.8 KiB |
| Value | Count | Frequency (%) | |
| isNewClassified | 2363 | 11.9% | |
| isNewlyBuilt | 1774 | 9.0% | |
| isUnderOption | 1654 | 8.3% | |
| isAnInteractiveSale | 429 | 2.2% | |
| isNewPrice | 405 | 2.0% | |
| isNotarySale | 168 | 0.8% | |
| isSoldOrRented | 27 | 0.1% | |
| (Missing) | 12992 | 65.6% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 19 |
|---|---|
| Median length | 3 |
| Mean length | 6.652836665 |
| Min length | 3 |
land
Real number (ℝ≥0)
| Distinct | 2166 |
|---|---|
| Distinct (%) | 10.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 709.4122754 |
|---|---|
| Minimum | 0 |
| Maximum | 220000 |
| Zeros | 10685 |
| Zeros (%) | 53.9% |
| Memory size | 154.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 350 |
| 95-th percentile | 2300 |
| Maximum | 220000 |
| Range | 220000 |
| Interquartile range (IQR) | 350 |
Descriptive statistics
| Standard deviation | 4358.677865 |
|---|---|
| Coefficient of variation (CV) | 6.144068853 |
| Kurtosis | 622.8964832 |
| Mean | 709.4122754 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 20.56953032 |
| Sum | 14054876 |
| Variance | 18998072.73 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0 | 10685 | 53.9% | |
| 150 | 95 | 0.5% | |
| 100 | 94 | 0.5% | |
| 120 | 81 | 0.4% | |
| 200 | 72 | 0.4% | |
| 110 | 70 | 0.4% | |
| 70 | 65 | 0.3% | |
| 300 | 64 | 0.3% | |
| 160 | 61 | 0.3% | |
| 1000 | 55 | 0.3% | |
| Other values (2156) | 8470 | 42.8% |
| Value | Count | Frequency (%) | |
| 0 | 10685 | 53.9% | |
| 1 | 21 | 0.1% | |
| 2 | 1 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 4 | 2 | < 0.1% |
| Value | Count | Frequency (%) | |
| 220000 | 1 | < 0.1% | |
| 150000 | 1 | < 0.1% | |
| 120000 | 1 | < 0.1% | |
| 117800 | 1 | < 0.1% | |
| 110000 | 1 | < 0.1% |
basement
Real number (ℝ≥0)
| Distinct | 2085 |
|---|---|
| Distinct (%) | 10.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 720.5367454 |
|---|---|
| Minimum | 0 |
| Maximum | 220000 |
| Zeros | 1439 |
| Zeros (%) | 7.3% |
| Memory size | 154.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 73 |
| median | 125 |
| Q3 | 340 |
| 95-th percentile | 2139 |
| Maximum | 220000 |
| Range | 220000 |
| Interquartile range (IQR) | 267 |
Descriptive statistics
| Standard deviation | 4282.219668 |
|---|---|
| Coefficient of variation (CV) | 5.943096858 |
| Kurtosis | 663.1824817 |
| Mean | 720.5367454 |
| Median Absolute Deviation (MAD) | 85 |
| Skewness | 21.31584435 |
| Sum | 14275274 |
| Variance | 18337405.28 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0 | 1439 | 7.3% | |
| 100 | 301 | 1.5% | |
| 90 | 268 | 1.4% | |
| 70 | 251 | 1.3% | |
| 80 | 240 | 1.2% | |
| 120 | 227 | 1.1% | |
| 110 | 219 | 1.1% | |
| 85 | 209 | 1.1% | |
| 75 | 191 | 1.0% | |
| 60 | 172 | 0.9% | |
| Other values (2075) | 16295 | 82.2% |
| Value | Count | Frequency (%) | |
| 0 | 1439 | 7.3% | |
| 1 | 49 | 0.2% | |
| 2 | 88 | 0.4% | |
| 3 | 63 | 0.3% | |
| 4 | 81 | 0.4% |
| Value | Count | Frequency (%) | |
| 220000 | 1 | < 0.1% | |
| 150000 | 1 | < 0.1% | |
| 120000 | 1 | < 0.1% | |
| 117800 | 1 | < 0.1% | |
| 110000 | 1 | < 0.1% |
building
Categorical
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 4192 |
| Missing (%) | 21.2% |
| Memory size | 154.8 KiB |
| Value | Count | Frequency (%) | |
| AS_NEW | 5580 | 28.2% | |
| GOOD | 5344 | 27.0% | |
| TO_BE_DONE_UP | 1375 | 6.9% | |
| TO_RENOVATE | 1191 | 6.0% | |
| JUST_RENOVATED | 1073 | 5.4% | |
| Not specified | 953 | 4.8% | |
| TO_RESTORE | 104 | 0.5% | |
| (Missing) | 4192 | 21.2% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 14 |
|---|---|
| Median length | 6 |
| Mean length | 6.403139511 |
| Min length | 3 |
fireplaceExist
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 19.3 KiB |
| Value | Count | Frequency (%) | |
| False | 18841 | 95.1% | |
| True | 971 | 4.9% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
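As a minimal sketch of that calculation, the function below divides the covariance by the product of the standard deviations. `pearson_r` is an illustrative helper, and the sample values are the land and basement columns of the ten rows shown under "First rows", so the result describes only that small sample.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # The 1/n factors cancel between numerator and denominator, so sums suffice.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# land and basement values from the first ten rows of the dataset.
land = [1403, 1526, 760, 63, 193, 128, 100, 312, 301, 540]
basement = [1403, 1526, 760, 63, 193, 128, 11, 312, 301, 540]

r = pearson_r(land, basement)
# Shifting and positively rescaling one variable leaves r unchanged.
r_rescaled = pearson_r(land, [2 * b + 3 for b in basement])
print(r, r_rescaled)
```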
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
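A small sketch of that definition follows: values are replaced by their ranks and Pearson's r is computed on the ranks. Assigning tied values their average rank is a common convention assumed here; the helper names and the cubic sample data are illustrative only.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# A monotonic but nonlinear relationship: y = x**3.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
rho = spearman_rho(xs, ys)
r = pearson_r(xs, ys)
print(rho, r)
```

On this sample ρ reaches its maximum while r stays below it, which is the difference between monotonic and linear correlation described above.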
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
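The pair-counting definition above can be sketched directly. This is the simple variant often called τ-a, which does not correct for ties; statistical libraries commonly report a tie-adjusted variant instead, so results on data with ties may differ. `kendall_tau_a` is an illustrative name.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs; ties count as neither."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    n = len(xs)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs

print(kendall_tau_a([1, 2, 3, 4], [1, 2, 3, 4]))   # fully concordant
print(kendall_tau_a([1, 2, 3, 4], [4, 3, 2, 1]))   # fully discordant
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))         # two concordant pairs, one discordant
```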
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for it.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| | Unnamed: 0 | type_property | type_subproperty | price | locality | netHabitableSurface | nr_bedrooms | kitchen_installed | nr_facades | hasGarden | garden_m2 | hasTerrace | terrace_m2 | furnished_YN | swimpool_YN | type_of_sale | land | basement | building | fireplaceExist |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 8901695 | HOUSE | MIXED_USE_BUILDING | 295000 | 4180 | 242 | 3 | True | 0 | True | 1000 | True | 36 | False | False | isNewPrice | 1403 | 1403 | GOOD | False |
| 1 | 8747010 | HOUSE | VILLA | 675000 | 8730 | 349 | 4 | True | 0 | True | 977 | False | 0 | False | False | isNewPrice | 1526 | 1526 | AS_NEW | False |
| 2 | 8775843 | HOUSE | APARTMENT_BLOCK | 250000 | 4020 | 303 | 5 | True | 0 | False | 0 | False | 0 | False | False | isNewPrice | 760 | 760 | TO_RENOVATE | False |
| 3 | 8910441 | HOUSE | HOUSE | 545000 | 1200 | 235 | 4 | True | 0 | False | 0 | False | 0 | True | False | NaN | 63 | 63 | JUST_RENOVATED | False |
| 4 | 8758672 | HOUSE | MIXED_USE_BUILDING | 500000 | 1190 | 220 | 2 | True | 0 | True | 60 | False | 0 | False | False | NaN | 193 | 193 | AS_NEW | False |
| 5 | 8725100 | COMMERCIAL | MIXED_USE_BUILDING_COMMERCIAL | 229500 | 4500 | 230 | 0 | True | 0 | False | 0 | False | 0 | False | False | isNewPrice | 128 | 128 | TO_BE_DONE_UP | False |
| 6 | 8940340 | HOUSE | HOUSE | 189000 | 4040 | 200 | 3 | True | 0 | True | 40 | False | 0 | False | False | isNewClassified | 100 | 11 | TO_BE_DONE_UP | False |
| 7 | 8923626 | HOUSE | MIXED_USE_BUILDING | 465000 | 4540 | 400 | 4 | True | 0 | False | 0 | False | 0 | False | False | isUnderOption | 312 | 312 | GOOD | False |
| 8 | 8913667 | HOUSE | APARTMENT_BLOCK | 650000 | 1150 | 200 | 4 | True | 0 | True | 150 | True | 4 | False | False | isUnderOption | 301 | 301 | GOOD | False |
| 9 | 8713285 | OFFICE | BUILDING | 350000 | 7090 | 700 | 0 | False | 0 | False | 0 | True | 0 | False | False | NaN | 540 | 540 | TO_RESTORE | False |
Last rows
| | Unnamed: 0 | type_property | type_subproperty | price | locality | netHabitableSurface | nr_bedrooms | kitchen_installed | nr_facades | hasGarden | garden_m2 | hasTerrace | terrace_m2 | furnished_YN | swimpool_YN | type_of_sale | land | basement | building | fireplaceExist |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 19802 | 8367880 | APARTMENT | APARTMENT | 189000 | 4000 | 122 | 3 | True | 0 | False | 0 | False | 0 | False | False | isNewlyBuilt | 0 | 122 | AS_NEW | False |
| 19803 | 8727088 | APARTMENT | APARTMENT | 120000 | 8430 | 36 | 1 | True | 0 | False | 0 | False | 0 | False | False | NaN | 0 | 36 | GOOD | False |
| 19804 | 8881075 | APARTMENT | APARTMENT | 267500 | 9120 | 101 | 2 | True | 0 | False | 0 | True | 22 | False | False | isNewPrice | 0 | 101 | GOOD | False |
| 19805 | 8903751 | APARTMENT | PENTHOUSE | 975000 | 1000 | 0 | 2 | True | 0 | False | 0 | True | 80 | False | False | isNewlyBuilt | 0 | 6 | AS_NEW | False |
| 19806 | 8863083 | APARTMENT | APARTMENT | 208000 | 8300 | 50 | 1 | True | 0 | False | 0 | True | 10 | False | False | NaN | 0 | 50 | JUST_RENOVATED | False |
| 19807 | 8876673 | APARTMENT | APARTMENT | 480000 | 1040 | 102 | 2 | True | 0 | False | 0 | True | 13 | True | False | NaN | 0 | 3 | AS_NEW | False |
| 19808 | 8948452 | APARTMENT | APARTMENT | 130000 | 6887 | 84 | 1 | True | 0 | False | 0 | False | 0 | False | False | isNewlyBuilt | 0 | 84 | JUST_RENOVATED | False |
| 19809 | 8887317 | APARTMENT | APARTMENT | 335000 | 1200 | 89 | 2 | True | 0 | False | 0 | True | 12 | False | False | isNewClassified | 0 | 89 | AS_NEW | False |
| 19810 | 8944979 | APARTMENT | APARTMENT | 98000 | 4480 | 63 | 3 | True | 0 | False | 0 | False | 0 | False | False | isNewClassified | 0 | 7 | TO_RENOVATE | False |
| 19811 | 8913656 | APARTMENT | APARTMENT | 195000 | 4000 | 96 | 2 | True | 0 | False | 0 | True | 0 | False | False | NaN | 0 | 96 | GOOD | False |
Most frequent
| | Unnamed: 0 | type_property | type_subproperty | price | locality | netHabitableSurface | nr_bedrooms | kitchen_installed | nr_facades | hasGarden | garden_m2 | hasTerrace | terrace_m2 | furnished_YN | swimpool_YN | type_of_sale | land | basement | building | fireplaceExist | count |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 8009981 | HOUSE | APARTMENT_BLOCK | 725000 | 2018 | 360 | 4 | True | 0 | False | 0 | False | 0 | False | False | isUnderOption | 160 | 160 | JUST_RENOVATED | False | 2 |
| 1 | 8016112 | HOUSE | EXCEPTIONAL_PROPERTY | 820000 | 1702 | 367 | 4 | True | 0 | True | 9143 | True | 30 | False | False | isNewPrice | 9143 | 9143 | GOOD | False | 2 |
| 2 | 8035441 | HOUSE | MIXED_USE_BUILDING | 450000 | 6760 | 259 | 5 | True | 0 | False | 0 | False | 0 | False | False | isUnderOption | 115 | 115 | JUST_RENOVATED | False | 2 |
| 3 | 8040133 | HOUSE | MIXED_USE_BUILDING | 125000 | 4020 | 125 | 3 | True | 0 | False | 0 | False | 0 | False | False | isUnderOption | 60 | 60 | TO_RENOVATE | False | 2 |
| 4 | 8088715 | HOUSE | MIXED_USE_BUILDING | 395000 | 6767 | 648 | 4 | True | 0 | False | 0 | False | 0 | False | False | isUnderOption | 1788 | 1788 | JUST_RENOVATED | False | 2 |
| 5 | 8122926 | HOUSE | MIXED_USE_BUILDING | 2850000 | 1200 | 2300 | 15 | True | 0 | False | 0 | True | 40 | False | False | isUnderOption | 1633 | 1633 | TO_RENOVATE | False | 2 |
| 6 | 8196608 | HOUSE | MIXED_USE_BUILDING | 149000 | 4860 | 345 | 5 | True | 0 | False | 0 | True | 25 | False | False | isUnderOption | 0 | 30 | TO_RENOVATE | False | 2 |
| 7 | 8205295 | HOUSE | MIXED_USE_BUILDING | 500000 | 4130 | 325 | 3 | True | 0 | False | 0 | True | 120 | False | False | isUnderOption | 300 | 300 | GOOD | False | 2 |
| 8 | 8349953 | HOUSE | MIXED_USE_BUILDING | 495000 | 7000 | 540 | 6 | True | 0 | False | 0 | True | 0 | False | False | isUnderOption | 0 | 540 | GOOD | False | 2 |
| 9 | 8383911 | HOUSE | MIXED_USE_BUILDING | 129000 | 7000 | 0 | 0 | False | 0 | False | 0 | False | 0 | False | False | isUnderOption | 0 | 0 | TO_RENOVATE | False | 2 |
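The table above lists fully identical rows together with how often each occurs. A minimal sketch of that kind of count, using only the standard library, is shown below; `duplicate_summary` is a hypothetical helper, the rows are shortened to (listing id, price) pairs taken from the table for brevity, and counting only the surplus copies as "duplicate rows" is an assumption about how the report arrives at its total.

```python
from collections import Counter

def duplicate_summary(rows):
    """Return ({row: count} for rows seen more than once, number of surplus copies)."""
    counts = Counter(rows)
    repeated = {row: c for row, c in counts.items() if c > 1}
    surplus = sum(c - 1 for c in repeated.values())
    return repeated, surplus

# Shortened example rows: (Unnamed: 0, price). One listing appears twice.
rows = [
    (8009981, 725000),
    (8016112, 820000),
    (8009981, 725000),
    (8035441, 450000),
]

repeated, surplus = duplicate_summary(rows)
print(repeated)
print(surplus)

# Keeping the first occurrence of each row while preserving order.
deduplicated = list(dict.fromkeys(rows))
print(len(deduplicated))
```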